[Draft]
Some observations
KNN matrix:
- Periods 2011-2012, 2013-2014, and 2019-2020 are the most distinctive compared to the others.
- 2017-2018 has a minor overlap with 2015-2016. In the context of my corpus, this could mean that my listening habits were similar in those years.
Random forest tree:
- Timbre is important (c01, c03, and c11).
- Instrumentalness is also an important factor. This could be due to math rock (usually without vocals in East Asia) and ambient music, both of which became prominent on my last.fm from 2015-2016 onwards.
Precision and recall of the random forest model per group:
| Group | Precision | Recall |
|---|---|---|
| 2011-2012 | 0.444 | 0.467 |
| 2013-2014 | 0.824 | 0.875 |
| 2015-2016 | 0.330 | 0.300 |
| 2017-2018 | 0.353 | 0.342 |
| 2019-2020 | 0.484 | 0.492 |
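As a reminder of how the two columns in the table are defined, here is a minimal sketch that derives precision and recall per group from a confusion matrix. The helper `precision_recall` and the toy matrix are hypothetical and only serve as illustration; they are not the model output behind the table.

```python
def precision_recall(confusion, labels):
    """Return {label: (precision, recall)} from a square confusion matrix.

    confusion[i][j] = number of tracks whose true group is labels[i]
    and whose predicted group is labels[j].
    """
    scores = {}
    for k, label in enumerate(labels):
        true_pos = confusion[k][k]
        predicted = sum(row[k] for row in confusion)  # column total
        actual = sum(confusion[k])                    # row total
        precision = true_pos / predicted if predicted else 0.0
        recall = true_pos / actual if actual else 0.0
        scores[label] = (precision, recall)
    return scores


# Toy two-group example (made-up numbers, NOT my real results).
labels = ["2011-2012", "2013-2014"]
toy = [[8, 2],
       [4, 6]]
scores = precision_recall(toy, labels)
```

Precision answers "of the tracks predicted to be in this period, how many really were?", while recall answers "of the tracks really in this period, how many did the model find?".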
To do: Resize figures. Make faceted plots of top features from the random forest.
What is your corpus, why did you choose it, and what do you think is interesting about it?
Last.fm is an online music database, a music recommender system, and a social networking service, founded in the days when MSN, Myspace, and Runescape were still a thing. The website offers a plugin to install on your PC and phone, which tracks your listening behaviour. Each listen, called a scrobble, is transferred (or “scrobbled”) to the database and displayed on your personal profile. Based on the collected data, it can also recommend new music to discover or connect you to people with a similar music taste. Although the social aspects have been watered down, I have been using the service since June 2011 (my profile). With such a vast amount of data up for grabs, it would be a waste to leave it untouched. That is why I’m interested in learning more about how my listening has changed over the years.
As of December 31st, 2020, I have just over 97,000 registered scrobbles and about 24,000 unique tracks, collected over the course of ten years. That is too much for the scope of this course, so I will limit the corpus to a set number of top tracks per year. This makes the data easier to explore without losing much of the overview of my general listening behaviour.
What are the natural groups or comparison points in your corpus and what is expected between them?
My corpus will be divided into years, from June 2011 to the end of 2020.
[Draft: As I grow older, some changes … teenage/adolescent years are formative, some consistencies to be expected … new music styles introduced … new music discovered along the way]
How representative are the tracks in your corpus for the groups you want to compare?
I used Spotlistr and Soundiiz to transfer 60 tracks per year from my last.fm profile to Spotify. Because at some point I started listening to albums more than to separate songs, I took the top 10 tracks of each year and picked the remaining 50 at random from ranks #11-100 to broaden the scope. Sometimes the tools did not pick the correct track because of changed metadata, which I corrected manually. Examples include band name changes, such as ‘Viet Cong’ to ‘Preoccupations’ and ‘Andrew Jackson Jihad’ to ‘AJJ’. If a top 10 song was missing, the next song was selected (#11 and so on). I also removed tracks that can be considered intros or interludes.
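The selection rule described above (top 10 plus 50 random picks from ranks #11-100) can be sketched as follows. This is only an illustration under assumptions: the function `select_tracks`, its parameters, and the placeholder chart are hypothetical, and the manual corrections (renamed bands, missing songs, removed intros) are not modelled.

```python
import random


def select_tracks(ranked_tracks, n_top=10, n_random=50, pool_end=100, seed=None):
    """Keep the first n_top tracks and sample n_random more from the rest
    of the chart, up to and including rank pool_end."""
    rng = random.Random(seed)             # seeded so the draw is repeatable
    top = ranked_tracks[:n_top]
    pool = ranked_tracks[n_top:pool_end]  # ranks 11..100 for the defaults
    sampled = rng.sample(pool, min(n_random, len(pool)))
    return top + sampled


# Placeholder yearly chart, ordered from most to least scrobbled.
chart = [f"track_{rank}" for rank in range(1, 101)]
selection = select_tracks(chart, seed=2020)
```

Sampling without replacement guarantees that no track ends up in the 60-track selection twice.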
My corpus comes with a few limitations:
I only started using the last.fm plugin on my smartphone in February 2015. Before then, I relied on my PC/laptop to log my scrobbles, which makes the data between 2011 and 2014 less accurate.
Possible under-representation of certain music styles in my 60-track selections. As a simple example: song 1 of style A has 100 scrobbles, and songs 2 and 3 of style B have 60 each. Song 1 ranks highest as a single track, yet style B has 120 scrobbles in total, which is more than style A’s 100.
On a handful of occasions, I fell asleep with my music and last.fm still on.
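The under-representation example from the limitations above (style A versus style B) can be checked with a few lines of Python. The songs, styles, and counts are the hypothetical ones from that example, not real last.fm data.

```python
from collections import defaultdict

# (song, style, scrobbles) as in the hypothetical example
scrobbles = [("song 1", "A", 100), ("song 2", "B", 60), ("song 3", "B", 60)]

# Total scrobbles per style
per_style = defaultdict(int)
for _song, style, count in scrobbles:
    per_style[style] += count

# The single most-played track, which is what a per-track chart ranks on
top_track = max(scrobbles, key=lambda entry: entry[2])
```

A chart that ranks individual tracks puts the style A song on top, while the per-style totals favour style B.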
Identify several tracks in your corpus that are either extremely typical or atypical
Typical songs:
Processed by the Boys - Protomartyr: Post-punk songs like this one have dominated my corpus since 2011.
Ferrum - Chihei Hatakeyama: Typical of the ambient music I listen to when I need to focus during work or study (especially since university).
Atypical songs:
Setsuyakuka - Tricot: Classified as math rock, a genre which gained prominence in my corpus starting from 2016.
Rosebud - U.S. Girls: I don’t listen to a lot of music that is considered ‘pop’. This track, however, ranks high in my last.fm charts.
Gruppa Krovi - Kino: Together with Tricot, one of only two groups in my all-time top 10 that do not sing in English.
Fig 1. Logo of last.fm
View my corpus per year:
[W06: Draft]
Things to do:
2014/2: Final exams (“centraal examen”) at secondary school (corresponds to the lower valence seen in the next figures?)
Pre-2015: Only ‘scrobbles’ from PC/laptop.
2015/1: New phone and subscribed to Spotify with last.fm
2016/3: Start UvA
2018/3: Half year in Hong Kong
Show within each bar the proportion of top-60 songs (probably easier to do per year rather than per quarter).
Turn above text into a readable story.
[W06: Draft]
[W06: Draft]
[W07: Draft]
Outlier: The song relies heavily on silence with occasional …
Things to consider: Most listened song vs favourite song of 2020. Compare multiple low valence/energy songs and see if they have something in common.
[W07: Draft]
Reason for inclusion: In my top tracks of 2014. Might become an outtake if it doesn’t fit in my portfolio story.
[W08: Draft]
On the left, you see two self-similarity matrices displaying the chroma and timbre features of Processed by the Boys by Protomartyr (top track of 2020). As for the settings: segments are set to bars, and the applied normalization and summary statistic are Euclidean and root mean square, respectively. The darker a cell, the more similar the two segments it compares, for example a bar and an earlier bar in the song.
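To make those settings concrete, here is a minimal sketch of how a self-similarity matrix can be built: frames are summarised per bar with the root mean square, each bar vector is scaled to unit Euclidean length, and all bars are then compared pairwise. This is not the code behind the figures. The function names and toy data are hypothetical, and the pairwise Euclidean distance is my assumption, since the distance measure is not listed in the settings above.

```python
import math


def rms_summary(frames):
    """Summarise the frames of one bar into a single vector (root mean square)."""
    n = len(frames)
    dims = len(frames[0])
    return [math.sqrt(sum(f[d] ** 2 for f in frames) / n) for d in range(dims)]


def euclidean_normalise(vec):
    """Scale a vector to unit Euclidean length (all-zero vectors are left as-is)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else list(vec)


def self_similarity(bars):
    """Pairwise Euclidean distances between summarised, normalised bars."""
    vecs = [euclidean_normalise(rms_summary(bar)) for bar in bars]
    return [[math.dist(a, b) for b in vecs] for a in vecs]


# Three toy bars with two-dimensional features; bars 1 and 3 differ only in loudness.
toy_bars = [
    [[1.0, 0.0], [1.0, 0.0]],
    [[0.0, 2.0], [0.0, 2.0]],
    [[3.0, 0.0], [3.0, 0.0]],
]
ssm = self_similarity(toy_bars)
```

Because of the normalisation, the first and third bars end up at distance 0 despite their different magnitudes, which is the kind of repetition that shows up as a dark off-diagonal cell.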
Findings:
…
…
Things to consider: Compare above to top track of years 2016, 2018 (Rosebud) or other/more years.
[W09: Draft]
With the previous songs in mind, I have created key- and chordograms of top tracks from 2018 (Rosebud by U.S. Girls) and 2020 (Processed by the Boys by Protomartyr).
The more prominent a key or chord is, the darker its colour.
I used “norm = manhattan; distance = manhattan” for both chordograms, and “norm = manhattan; distance = aitchison” for both keygrams.
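As a rough illustration of what these norm and distance settings mean, the sketch below matches a chroma vector against two chord templates. It is a simplified stand-in, not the implementation used for the figures. The templates, the helper names, and the small `eps` added before taking logarithms in the Aitchison distance are all assumptions of mine.

```python
import math


def manhattan_normalise(vec):
    """Scale a vector so its absolute values sum to 1."""
    total = sum(abs(x) for x in vec)
    return [x / total for x in vec] if total else list(vec)


def manhattan_distance(a, b):
    """Sum of absolute differences."""
    return sum(abs(x - y) for x, y in zip(a, b))


def aitchison_distance(a, b, eps=1e-6):
    """Euclidean distance between centred log-ratio transforms.
    eps avoids log(0); how zeros are really handled is an assumption here."""
    def clr(vec):
        logs = [math.log(x + eps) for x in vec]
        mean = sum(logs) / len(logs)
        return [v - mean for v in logs]
    return math.dist(clr(a), clr(b))


# Binary triad templates over the pitch classes C, C#, D, ..., B (illustrative only)
TEMPLATES = {
    "C:maj": [1, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0],  # C, E, G
    "A:min": [1, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0],  # A, C, E
}


def best_chord(chroma):
    """Return the template name closest to the chroma vector
    (Manhattan norm, Manhattan distance)."""
    c = manhattan_normalise(chroma)
    return min(
        TEMPLATES,
        key=lambda name: manhattan_distance(c, manhattan_normalise(TEMPLATES[name])),
    )


# Toy chroma vector with most energy on C, E, and G
chroma = [0.9, 0, 0, 0, 0.8, 0, 0, 0.7, 0, 0.1, 0, 0]
match = best_chord(chroma)
```

The Aitchison distance compares ratios between pitch classes rather than absolute amounts, so rescaling a vector barely changes the result.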
Need to work out my findings
Things to consider: Compare above to top track of years 2016 (or other/more years).
On the left, you’ll see histograms of the tempo of every song, grouped per two years. The average tempo is similar across time periods, ranging between 125 and 135 bpm. A minor outlier is 2015-2016, which comes in at around 134 bpm. This coincides with 2016 having the highest energy (see 3.1).
A typical (“Unconscious Melody”) and an atypical (“Setsuyakuka”) song were selected to compare their tempograms. The tempo of “Unconscious Melody” is mostly constant between 115 and 120 bpm.
The tempo of “Setsuyakuka”, however, is less constant. Although the tempo is relatively pronounced around 90 bpm, it still varies considerably. Interestingly, the tempo between 150 and 200 seconds seems to be higher than in the rest of the song. Listening to the song, it is possible that the tempogram had an easier time estimating the tempo there, as that section sounds quieter than the rest.
Settings:
- window_size = 4
- hop_size = 1
- cyclic = TRUE
TODO
TODO